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Abstract 



We study the problem of estimating the best B term Fourier representation for a given frequency-sparse signal 
(i.e., vector) A of length N » B. More explicitly, we investigate how to deterministically identify B of the largest 
magnitude frequencies of A, and estimate their coefficients, in polynomial{B ,\o^N) time. Randomized sub-linear time 
algorithms which have a small (controllable) probability of failure for each processed signal exist for solving this 
problem. However, for failure intolerant applications such as those involving mission-critical hardware designed to 
process many signals over a long lifetime, deterministic algorithms with no probability of failure are highly desirable. 
In this paper we build on the deterministic Compressed Sensing results ofCormode and Muthukrishnan ( CM) [26, 6, 7] 
in order to develop the first known deterministic sub-linear time sparse Fourier Transform algorithm suitable for 
failure intolerant applications. Furthermore, in the process of developing our new Fourier algorithm, we present 
a simplified deterministic Compressed Sensing algorithm which improves on CM's algebraic compressibility results 
while simultaneously maintaining their results concerning exponential decay. 



1 Introduction 

In many applications only the top few most energetic terms of a signal's Fourier Transform (FT) are of interest. 
In such applications the Fast Fourier Transform (FFT), which computes all FT terms, is computationally wasteful. 
To make our point, we next consider a simple application-based example in which the FFT can be replaced by faster 
approximate Fourier methods: 

Motivating Example: sub-Nyquist frequency acquisition 

Imagine a signal/function / : [0, 2ti[ — > C of the form 

fix) = C ■ e'"-" 

consisting of a single unknown frequency co e {-N,N] (e.g., consider a windowed sinusoidal portion of a wideband 
frequency-hopping signal [21]). Sampling at the Nyquist-rate would dictate the need for at least 2N equally spaced 
samples from / in order to discover co via the FFT without aliasing [3]. Thus, we would have to compute the FFT of 
the 2N-length vector 

A(7)=/(^|, 0</<2M 

However, if we use aliasing to our advantage we can correctly determine co with significantly fewer /-samples taken 
in parallel. 
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Consider, for example, the two-sample Discrete Fourier Transform of /. It has 



1 + (-1)"" -^ 1 + (-1)"-! 

m = C- \ ' and /(I) = C-—^f—. 

Clearly /(O) = implies that oj = 1 modulo 2 while /(I) = implies that co = modulo 2. In this fashion we may use 
several potentially aliased Fast Fourier Transforms in parallel to discover co modulo 3, 5, ... , the O(logN)* prime. 
Once we have collected these moduli we can reconstruct co via the famous Chinese Remainder Theorem (CRT). 

Theorem 1 Chinese Remainder Theorem (CRT): Any integer x is uniquely specified mod N by its remainders 
modulo m relatively prime integers pi,...,pm as long as YlHi pi > N. 

To finish our example, suppose that N = 500,000 and that we have used three FFT's with 100, 101, and 103 
samples to determine that co = 3A mod 100, co = 3 mod 101, and co = 1 mod 103, respectively. Using that co = 1 mod 
103 and we can see that co = 103 • a + 1 for some integer a. Thus, 

(103 • fl + 1) = 3 mod 101 =^2a = 2 mod 101 ^ a = 1 mod 101. 

Therefore, a = 101 • + 1 for some integer b. Substituting for a we get that co = 10403 • b + 104. By similar work we 
can see that b = 10 mod 100 after considering co modulo 100. Hence, co - 104, 134 by the CRT. As an added bonus 
we note that our three FFTs will have also provided us with three different estimates of a;'s coefficient C. 

The end result is that we have used significantly less than 2N samples to determine co. Using the CRT we required 
only 100 + 101 + 103 = 304 samples from / to determine a) since 100 ■ 101 ■ 103 > 1, 000, 000. In contrast, a million 
/-samples would be gathered during Nyquist-rate sampling. Besides needing significantly less samples than the FFT, 
this CRT-based single frequency method dramatically reduces required computational effort. And, it's deterministic. 
There is no chance of failure. Of course, a single frequency signal is incredibly simple. Signals involving more than 
1 non-zero frequency are much more difficult to handle since frequency moduh may begin to colhde modulo various 
numbers. DeaUng with the potential difficulties caused by such frequency collisions in a deterministic way comprises 
the majority of this paper. 

1.1 Compressed Sensing and Related Work 

Compressed Sensing (CS) methods [4, 28, 26, 6, 7] provide a robust framework for reducing the number of mea- 
surements required to estimate a sparse signal. For this reason CS methods are useful in areas such as MR imaging 
[23, 24] and analog-to-digital conversion [21, 20] where measurement costs are high. The general CS setup is as 
follows: Let A be an N-length signal/vector with complex valued entries and ^ be a fuU rank NxN change of basis 
matrix. Furthermore, suppose that W • A is sparse (i.e., only k<^N entries of W • A are significant/large in magnitude). 
CS methods deal with generating aKxN measurement matrix, M, with the smallest number of rows possible (i.e., K 
minimized) so that the k significant entries of W • A can be well approximated by the K-element vector result of 

MWA. (1) 

Note that CS is inherently algorithmic since a procedure for recovering W ■ A's largest fc-entries from the result of 
Equation 1 must be specified. 

For the remainder of this paper we will consider the special CS case where W is the N X N Discrete Fourier 
Transform matrix. Hence, we have 

= (2) 

Our problem of interest is to find, and estimate the coefficients of, the k significant entries (i.e., frequencies) of A given 
a frequency- sparse (i.e., smooth) signal A. In this case the deterministic Fourier CS measurement matrixes, M ■ 
produced by [28, 26, 6, 7] require super-Unear 0(KZV)-time to multiply by A in Equation 1. Similarly, the energetic 
frequency recovery procedure of [4, 9] requires super-Unear time in N. Hence, none of [4, 28, 9, 26, 6, 7] have both 
sub-linear measurement and reconstruction time. 



Existing randomized sub-linear time Fourier algorithms [15, 19, 16] not only show great promise for decreasing 
measurement costs, but also for speeding up the numerical solution of computationally challenging multi-scale prob- 
lems [8, 18]. However, these algorithms are not deterministic and so can produce incorrect results with some small 
probability on each input signal. Thus, they aren't appropriate for long-lived failure intolerant applications. 

In this paper we build on the deterministic Compressed Sensing methods of Cormode and Muthukrishnan (CM) 
[26, 6, 7] in order to construct the first known deterministic sub-linear time sparse Fourier algorithm. In order to 
produce our new Fourier algorithm we must modify CM's work in two ways. First, we alter CM's measurement 
construction in order to allow sub-linear time computation of Fourier measurements via ahasing. Thus, our algorithm 
can deterministically approximate the result of Equation 1 in time i^-polylogCAT). Second, CM use a fc-strongly 
selective collection of sets [17] to construct their measurements for algebraically compressible signals. We introduce 
the generaUzed notion of a X-majority fc-strongly selective collection of sets which leads us to a new reconstruction 
algorithm with better algebraic compressibility results than CM's algorithm. As a result, our deterministic sub-linear 
time Fourier algorithm has better then previously possible algebraic compressibility behavior. 

The main contributions of this paper are: 

1. We present a new deterministic compressed sensing algorithm that both {i) improves on CM's algebraically 
compressible signal results, and (//) has comparable measurement/run time requirements to CM's algorithm for 
exponentially decaying signals. 

2. We present the first known deterministic sub-linear time sparse DFT In the process, we explicitly demonstrate 
the connection between compressed sensing and sub-linear time Fourier transform methods. 

3. We introduce K-majority A:-strongly selective collections of sets which have potential applications to streaming 

algorithms along the lines of [25, 13]. 

The remainder of this paper is organized as follows: In section 2 we introduce relevant definitions and terminol- 
ogy. Then, in section 3 we define K-majority fc-strongly selective collections of sets and use them to construct our 
compressed sensing measurements. Section 4 contains our new deterministic compressed sensing algorithm along 
with analysis of it's accuracy and run time. Finally, we present our deterministic sub-linear time Fourier algorithm in 
sections 5 and 5.1. Section 6 contains a short conclusion. 

2 Preliminaries 

Throughout the remainder of this paper we will be interested in complex-valued functions / : [0, 2n] — > C and 
signals (or arrays) of length N containing / values at various x e [0, 2n]. We shall denote such signals by A, where 
A(y) e C is the signal's complex value for all e [0, N - 1] c N. Hereafter we will refer to the process of either 
calculating, measuring, or retrieving the / value associated any A(/) e C from machine memory as sampling from / 
and/or A. Given a signal A we define its discrete L^-norm, or EucUdean norm, to be 



For any signal. A, its Discrete Fourier Transform (DFT), denoted A, is another signal of length N defined as follows: 



Furthermore, we may recover A from its DFT via the Inverse Discrete Fourier Transform (IDFT) as follows: 



N-1 




We will also refer to ||A||2 as A's energy. 





N-1 



We will refer to any index, a), of A as a frequency. Furthermore, we will refer to A{co) as frequency w's coefficient for 
each o) e [0, JV - 1]. Parseval's equality tells us that ||A||2 = ||A||2 for any signal. In other words, the DFT preserves 
Euclidean norm and energy. Note that any non-zero coefficient frequency will contribute to A's energy. Hence, we 
will also refer to |A(£li)P as frequency co's energy. If |A(ai)| is relatively large we'll say that co is energetic. 

Our algorithm produces output of the form (coi, Ci), . . . , {mb, Cg) where each (coj, Cj) e [0, N - 1] X C. We will 
refer to any such set of B < N tuples 



[{a}j,Cj) e[0,N-l]xC s.t. 1 < j < b] 



as a sparse Fourier representation and denote it with a superscript 's'. Note that if we are given a sparse Fourier 
representation, R , we may consider R to be a length-N signal. We simply view R as the N length signal 



6(7) 



^] Cj if(y,Q)eR 
otherwise 



for all j e [0,N - 1]. Using this idea we may, for example, compute R from R via the IDFT. 

A B term/tuple sparse Fourier representation is B-optimal for a signal A if it contains B of the most energetic 
frequencies of A along with their coefficients. More precisely, we'll say that a sparse Fourier representation 

r' = [{cvj, Cj) e [0,N - 1] X C s.t. 1 < / < b) 

is B-optimal for A if there exists a valid ordering of A's coefficients by magnitude 

|A(a;i)| > |A(a;2)| > • . . > |A(a;y)| > . . . > |A(a;N)l (3) 

so that ^{coi, A{a)i)) | 1 e [1,B]| = tt^. Note that a signal may have several B-optimal Fourier representations if its 
frequency coefficient magnitudes are non-unique. For example, there are two 1-optimal sparse Fourier representations 
for the signal 

2nij inij 

A{j) = 2e~ + 2e~, N > 2. 

However, all B-optimal Fourier representations, Ropj, for any signal A will always have both the same unique URoptlh 
and ||A — Roptll2 values. 

We continue with two final definitions: Let oji, be a fc* most energetic frequency as per Equation 3. We will say 
that a signal A is (algebraically) p-compressible for some p > 1 if |A(a;b)| = 0{b~P) for all b e [1,N). If R^pj is a 
B-optimal Fourier representation we can see that 

l|A - Roptii = g \A{co,)\l = O b-^mj = OiB'-'P). (4) 

Hence, any p-compressible signal A (i.e., any signal with a fixed c e K so that |A(iD;,)|2 < c ■ b~P for all b e [1,N)) 
will have ||A - R°f\\l < Cp • B^'^P for some Cp e M. For any p-compressible signal class (i.e., for any choice of p 

and c) we will refer to the related optimal 0(B^~^'')-size worst case error value (i.e.. Equation 4 above) as HCg^'Hj. 
Similarly, we define an exponentially compressible (or exponentially decaying) signal for a fixed a to be one for 
which |A(a;i,)l = 0(2""''). The optimal worst case error is then 

llCl|2 = o|j^ 4-'"'db'j = 0{A-"\ (5) 

Fix 6 small (e.g., 6 - 0.1). Given a compressible input signal. A, our deterministic Fourier algorithm will identify 
B of the most energetic frequencies from A and approximate their coefficients to produce a Fourier representation R 
with||A-R||2 < ||A-Ropt||2 + 6||C°P'||2. These are the same types of compressible signal results proven by CM [6, 7]. 



3 Construction of Measurements 



We will use the following types of subset collections to form our measurements: 

Definition 1 A collection, S, ofs subsets of [0, N) is called K-majority k-strongly selective if for any X c [0, N) with 
\X\ < k, and for all x e X, the following are true: (i) x belongs to K subsets in S and, (ii) more than two-thirds of 
Sj e S containing x are such that Sj nX = \x} (i.e., every member ofX occurs separated from all other members of 
X in more than two-thirds of the K S-subsets it belongs to). 

A 1-majority fc-strongly selective collection of sets is an example of a fc-strongly selective collection of sets [17, 
26]. Note that a J<C-majority fc-strongly selective collection of subsets contains many fc-strongly selective collections of 
subsets (i.e., it has repeated strong selectivity). Thus, our newly defined i<C-majority fc-strongly selective collections are 
help us count how many times each small subset element is isolated. This added structure allows a new reconstruction 
algorithm (Algorithm 1) with better algebraic compressibility properties than previous methods. 

Next, we will build O(logAf) X-majority fc-strongly selective collections of subsets. Each of these O(logN) col- 
lections will ultimately be used to determine energetic frequencies modulo a small prime < N. These moduU will 
then be used along with the Chinese Remainder Theorem to reconstruct each energetic frequency in a manner akin to 
the introduction's motivating example. Our technique is motivated by the method of prime groupings first employed 
in [25]. To begin, we will denote each of the O(IogN) collections of subsets by Si where < / < 0(log]V). We 
construct each of these X-majority fc-strongly selective collections as follows: 

Define po = l and let 

pi,p2,---,Pl,---,Pm 

be the first m primes where m is such that 




Hence, pi is the prime natural number and we have 

po = l,pi = 2,p2 = 3,p3 = 5,...,pm = O(mlogm). 

Note that we know p^ = 0{m log m) via the Prime Number Theorem, and so pm = 0(log N log log N). Each pi will 
correspond to a different K-majority fc-strongly selective collection of subsets of [0, N) = (0, . . . , N - 1). 
Along the same lines we let qi through be the first K (to be specified later) consequitive primes such that 

max{pm, k)<q-i<q2< ...<qK- 

We are now ready to build So, our first JC-majority k-strongly selective collection of sets. We begin by letting Soj,h for 
all 1 < / < K and < /z < - 1 be 

So,j,h = {n e [0,N)\n = h mod qj]. 
Next, we progressively define So,y to be all integer residues mod qj, i.e., 

So,j = {So,jji I h e [0,qj)}, 

and conclude by setting So equal to all K such qj residue groups: 

So = U^jSoj. 

More generally, we define Si for < Z < m as follows: 

Si = {{n e[0,N)\n = h mod piqj] \ h e [0,piqj)] . 



Lemma 1 Fix k. If we set K > 3{k - l)[logj^NJ + 1 then So will be a K-majority k-strongly selective collection of 
sets. Furthermore, ifK = O(fclogj^N) then \So\ = O^fc^log^N • max(logfc,loglogj.N)). 



Proof: 



Let X c {Q,N) be such that |X| < fc. Furthermore, let ;c, y e X be such that x y. 'Ry the Chinese Remainder 
Theorem we know that x and y may only collide modulo at most [logj, NJ of the K (j-primes qK > ■ ■ ■ > (^i > Hence, 
X may collide with all the other elements of X (i.e., with X- {x}) modulo at most {k— l)\\og^ Nj (^-primes. We can now 
see that x will be isolated from aU other elements of X modulo at least X- (fc- l)Llogj^ NJ > 2(fc- l)Llogj. N\ + l> ^ 
(^-primes. This leads us to the conclusion that So is indeed K-majority fc-strongly selective. 

Finally, we have that 

K 

\So\<Y,1i^K-qK. 

;=i 

Furthermore, given that K > max{k,m), the Prime Number Theorem tells us that qK = 0{KlogK). Thus, we can see 
that So will indeed contain O (j^ log^ N ■ max(log fc, log log^. Nfj sets. □ 

Note that at least 0(fc logj. N) primes are required in order to create a (X-majority) fc-strongly separating collection 
of subsets using primes in this fashion. Given any x e [0, IV) a fc — 1 element subset X can be created via the Chinese 
Remainder Theorem and x moduli so that every element of X collides with x in any desired 0(logj^ N) q-pnmes. We 
next consider the properties of the other m collections we have defined: Si, . . ■ ,Sm- 

Lemma 2 Let Sij^i, = {n e [0,N) \ n = h mod piqj], X c [0, N) have < k elements, and x e X. Furthermore, suppose 
that Soj^h n X = {x}. Then, for all I e [1, m], there exists a unique h e [0, pi) so that Sij^k+b-qj H X = {x}. 

Proof- 
Fix any / e [1, m]. Soj> n X = {x} implies that x = h + a ■ qj for some unique integer a. Using fl's unique rep- 
resentation modulo pi (i.e., a = b + c ■ pi) we get that x = h + b-qj + c- qjpi. Hence, we can see that x e S^h+bqj- 
Furthermore, no other element of X is in Sij^h+t-qj for any t e [0, pi) since it's inclusion therein would imply that it was 
also an element of So,j,h- ^ 

Note that Lemma 2 and Lemma 1 together imply that each S\,..., Sm is also a K-majority fc-strongly separating 
collection of subsets. Also, we can see that if x £ S^h+b-qj we can find x mod pi by simply computing h + bqj mod p;. 
Finally, we form our measurement matrix. 

Set iS = To form our measurement matrix, M, we simply create one row for each Sij^h ^ Shy computing 

the N-length characteristic function vector of S;y;,, denoted XS/,,,- This leads to A\ being a O(fc^) x N measurement 
matrix. Here we bound the number of rows in A1 by noting that: (i) \S\ < m ■ K ■ Pm^Ky (H) m = O(logN), {Hi) 
pm = 0(logN • loglogN), (iv) K = O(fclogN), and (v) qK = OiKlogK). 



4 Signal Reconstruction from Measurements 

Let A be an N-length signal of complex numbers with it's N entries numbered through N — 1. Our goal is to 
identify B of the largest magnitude entries of A (i.e., the first B entries in a valid ordering of A as in Equation 3) and 
then estimate their signal values. Toward this end, set 

(6) 

V2C 

where C > 1 is a constant to be specified later, and let B' be the smallest integer such that 

N-l 



X |A(a;i,)| < |. (7) 



b=B' 



Algorithm 1 Sparse Approximate 

1: Input: Signal A, integers B, B' 

2: Output: R , a sparse representation for A 

3: Initialize «- 

4: SetX = 3B'Llogg,NJ + l 

5: Form measurement matrix, Ai, via K-majority B'-strongly selective collections (Section 3) 

6: Compute M ■ A 

Identification 

7: for / from 1 to K do 

8: Sort <Xso,.o/ A), <Xso./,i/ A), . . . , {xso,i,y-u^) by magnitude 

9: for b from 1 to B' + 1 do 

10: kjj, <— largest magnitude (jso,./ A)-measurement 

11: %b <~ kj,bS associated residue mod qj (i.e., the • in (xso^,'^)) 

12: for / from 1 to m do 

13: Un ^ mmte[0,p,) \kj,b - {XS,,i,,,.,r„_^ , A)| 

14: r;,i, <- ro,i, + W • (\] mod p; 

15: end for 

16: Construct (Uy^j, from ro,;,, . . . , Tm,;, via the Chinese Remainder Theorem 

17: end for 

18: end for 

19: Sort iyy,i,'s maintaining duplicates and set C(&;y,i,) = the number of times iuy,;, was constructed via Une 16 

Estimation 

20: for / from 1 to X do 

21: for h from 1 to B' + 1 do 

22: if C{ixij^) > ^ then 

23: C(aiy,i,) <- 

24: X = median{real(fcy',!,')|cLiy',i,' = Mj^} 

25: 1/ = median{imag(A:y/^fc')|a)y',i,' = coj^] 

26: R*" <- R' U {{cL)j^i„x + iy)) 

27: end if 

28: end for 

29: end for 

30: Output B largest magnitude entries in R 



Note that B' is defined to be the last possible significant frequency (i.e., with energy > a fraction of |A((yB)|). We 
expect to work with sparse/compressible signals so that B < B' N. Later we will give specific values for C and B' 
depending on B, the desired approximation error, and A's compressibility characteristics. For now we show that we 
can identify/approximate B of A's largest magnitude entries each to within e-precision via Algorithm 1. 

Algorithm 1 works by using Sq measurements to separate A's significantly energetic frequencies Q - {coi, . . . , cob'] C 
[0,N). Every measurement which successfully separates an energetic frequency coj from all other members of Q will 

both (/) provide a good (i.e., within | < ^^^^) coefficient estimate for coj, and {ii) yield information about cofs iden- 
tity. Frequency separation occurs because our So measurements can't colUde any fixed ooj e Q with any other member 

of Q modulo more then (B' — 1) logg, N (/-primes (see Lemma 1). Therefore, more than I of <So's 3B' logg, N+1 
(^-primes will isolate any fixed coj e Q. This means that our reconstruction algorithm will identify all frequencies at 
least as energetic as ojb at least IB' logg, N + 1 times. We can ignore any frequencies that aren't recovered this often. 
On the other hand, for any frequency that is identified more then IB' logg, N times, at most B' logg, N of the measure- 
ments which lead to this identification can be significantly contaminated via collisions with O members. Therefore, 
we can take a median of the more than IB' logg, N measurements leading to the recovery of each frequency as that 



Algorithm 2 Fourier Measure 
1: Input: /-samples, integers m, K 
2: Output: < xs,,,,// >-measurements 

3: Zero a 0(ijjcpm)-element array, A 
4: for j from 1 to K do 
5: for I from 1 to m do 

7: Calculate A via FFT 

8: < Xs,,ij,,f ><- Mh) for each h e [0, (/yp;) 

9: end for 

10: end for 

11: Output < xs/,y;,// >-nieasurements 



frequency's coefficient estimate. Since more than half of these measurements must be accurate, the median will be 
accurate. The following Theorem is proved in the appendix. 

Theorem 2 Let Ropt be a B-optimal Fourier representation for our input signal A. Then, the B term representation Pt 

returned from Algorithm 1 is such that \\A—R\\^ < ||A — /?optll2 + I^^^b)! _ Furthermore, Algorithm 1 's Identification 
and Estimation (lines 7 - 30) run time is 0{B'^ log^ N). The number of measurements used is 0(B'^ log^ N). 

Theorem 2 immediately indicates that Algorithm 1 gives us a deterministic 0(B^ log^ ]V)-measurement, 0{B^ log"* N)- 
reconstruction time method for exactly recovering B-support vectors. If A is a B-support vector then setting B' = 
B + 0(1) and C = 1 will be sufficient to guarantee that both |A(ii;b)P = and YJb=B' \^{'^b)\ = are true. Hence, we 
may apply Theorem 2 with B' = B + Oil) and C = 1 to obtain a perfect reconstruction via Algorithm 1. However, we 
are mainly interested in the more reaUstic cases where A is either algebraically or exponentially compressible. The 
following theorem (proved in the appendix) presents itself. 

Theorems Let A be p-compressible. Then, Algorithm 1 can return a B term sparse representation with \\A - 
R\\l < ||A -l?optll2 + 5||Cg^'ll2 using olsT^d^ log'^wj total identification/estimation time andO^B'^d^ log^wj 
measurements. If A decays exponentially. Algorithm 1 can return a B term sparse representation, R^, with WA—RW^ < 
\\A — RoptWj + 5||Cg'"|l2 using both (b^ + log^ 6"^) • polylog(N) measurements and identification/estimation time. 

( ^ _6_ . \ 

For p-compressible signals, p > 2, CM's algorithm [6, 7] takes 0\Br-^6^-p log Nl- identification/estimation time 

and OIBp-2g2-p jgg N l-measurements to achieve the same error bound. As a concrete comparison, CM's algorithm 

requires 0(B^*6~^ log^ N)- identification/estimation time and 0(B^^(5~'* log* N)-measurements for 3-compressible 
signals. Algorithm 1 , on the other hand, requires only 0(B^6~^ log"* N)- identification/estimation time and 0(B^6~^ log^ Ad- 
measurements. Hence, we have improved on CM's algebraic compressibility results. All that's left to do in order to 
develop a deterministic sub-hnear time Fourier algorithm is to compute our CS Fourier measurements (Algorithm 1 
lines 1 - 6) in sub-linear time. 

5 Sub-linear Time Fourier Measurement Acquisition 

Our goal in this section is to demonstrate how to use Algorithm 1 as means to approximate the Fourier transform 
of a signal/function / : [0, 27t] C, where (/) / has an integrable p**^ derivative, and (//) /(O) = f{2n),f'{0) = 
f'{2n), . . .,f^P~^\0) = f'-P~^\2'n). In this case we know the Fourier coefficients for / to be p-compressible [3, 12]. 
Hence, for N = qi-pi ■ ■ -pm sufficiently large, if we can collect the necessary Algorithm 1 (line 5 and 6) measurements 
in sub-hnear time we wiU indeed be able to use Algorithm 1 as a sub-linear time Fourier algorithm for /. 



Note that in order to validate the use of Algorithm 1 (or any other sparse approximate Fourier Transform method 
[15, 16]) we must assume that / exhibits some multi-scale behavior. If / contains no unpredictably energetic large 
(relative to the number of desired Fourier coefficients) frequencies then it is more computationally efficient to simply 
use standard FFT/USFFT methods [5, 22, 1, 10, 11]. The responsible user, therefore, is not entirely released from the 
obligation to consider /'s Hkely characteristics before proceeding with computations. 

Choose any Section 3 (j-prime qj, j e [1, K], and any p-prime pi with / e [0, m]. Furthermore, pick h e [0, qjpi). 
Throughout the rest of this discussion we will consider / to be accessible to sampling at any desired predetermined 

positions f e [0, 2n]. Given this assumption we may sample / at t = 0, ^^^Y' in order to perform the 

following DFT computation: 



1 "-sr r/2Hfc\ ^ 



k=0 

Via aliasing [3] this reduces to 

1 I2nk\ ^ 1 ''-^U ^ ^ 2^) ^ 1 » ^ ^i^!;! 

iiHi xiiF'/ iiyi t=o V<i>=-oo ) liy^ co=-c<, k=o i^^h mod CI fp, 

Using Sections 3 and 4 we can see that these measurements are exactly what we need in order to determine B of the 
most energetic frequencies of / modulo N = qi-pi - ■ -pm (ie., B of the most energetic frequencies of /'s band-limited 
interpolant's DFT). 

We are now in the position to modify Algorithm 1 in order to find a sparse Fourier representation for /. To do so 
we proceed as follows: First, remove lines 5 and 6 and replace them with Algorithm 2 for computing all the necessary 
< XSijH'f >-measurements. Second, replace each "< Js;,;,/ A >" by "< XSij^,/ >" in Algorithm I's IDENTIFICATION 
section. It remains to show that these Algorithm 1 modifications indeed yield a sub-linear time approximate Fourier 
transform. The following theorem presents itself (see appendix for proof): 

Theorem 4 Letf : [0, 2n]^Chave (;) an integrable derivative, and (it) /(O) = /(2ti), /' (0) = /' (27t), . . . , /(P-^) (0) 
f^~^\2n) for some p > 1. Furthermore, assume that f's B' = O^B~e~ j largest magnitude frequencies all belong 
to [tJ]- ^'^y Algorithm 1 to return a B term sparse Fourier representation, R^, for f such that 

ll/- R\\\ < 11/ - -Roptll2 + 6||C°P'||2 using O ^B^6^ log^ Nj-time and O ^B^6^ log'' Nj-measurements from f. 

If / : [0, 2n] — > C is smooth (i.e., has infinitely many continuous derivatives on the unit circle where is identified 
with 2n) it follows from Theorem 4 that Algorithm 1 can be used to find an 6-accurate, with 6 = C)(^)> sparse 

B-term Fourier representation for / using 0(B'^)-time/measurements. This result differs from previous sub-linear time 
Fourier algorithms [15, 16] in that both the algorithm and the measurements/samples it requires are deterministic. 
Recall that the deterministic nature of the algorithm's required samples is potentially beneficial for failure intolerant 
hardware. In signal processing applications the sub-Nyquist sampling required to compute Algorithm I's < XSy;,/ / >- 
measurements could be accomplished via 0(B) parallel low-rate analog-to-digital converters. 

5.1 DFT from Inaccessible Signal Samples 

Throughout the remainder of this section we will consider our N- length compressible vector A to be the product of 
the N X N DFT matrix, ^, and a non-sparse N-length vector A. Thus, 

A = ^A. 

Furthermore, we will assume that A contains equally spaced samples from some unknown smooth function / : 
[0,27t] — > C (e.g., A's band-limited interpolent). Hence, 



We would like to use our modified Algorithm 1 along with Algorithm 2 to find a sparse Fourier representation for A. 
However, unless ^ e N for all qjpj-pairs (which would imply / had been grossly oversampled), A won't contain all 
the /-samples required by Algorithm 2. Not having access to / directly, and restricting ourselves to sub-linear time 
approaches only, we have little recourse but to locally interpolate / around our required samples. 

For each required Algorithm 2 /-sample at f = /j e [0, qjPi), we may approximate /(f) to within 0(N~^'^)-error 
by constructing 2 local interpolents (one real, one imaginary) around t using A's nearest 2k entries [14]. These errors 
in /-samples can lead to errors of size 0{N~^^ ■ pmCjKiogpmCjK) in our < XSi,j,h'f > calculations. However, as long as 
the < XSij,,,f >-measurement errors are small enough (i.e., of size 0{6 ■ B~P) in the p-compressible case) Theorem 4 
and all related Section 4 results and will still hold. Using the proof of Theorems 2 and 3 along with some scratch work 
we can see that using 2k = 0(log + p) interpolation points per /-sample ensures all our < Xs,,jj,ff >-measurement 
errors are 0(6 • B~f ). We have the following result: 

Theorem 5 Let A = WA be p-compressible. Then, we may use Algorithms 1 and 2 to return a B term sparse 
representation, R^, for A such that \\A - R\\'^ < \\A - l?optll2 + ^llCg^'llj using O^Br^6^Q.og6~^ +pf^-time and 

O |bFi 6T=p (log + p)^-samples from A. 

Notice that Theorem 5 no longer guarantees an 6 = 0(^)-accurate 0(B^)-time DFT algorithm for smooth data (i.e., 
A's containing samples from a smooth function /). This is because as p — > oo we require an increasingly large number 
of interpolation points per /-sample in order to guarantee our < XSijf,,f > -measurements remain 0(6 • B~'')-accurate. 

However, for 6 = 0(log~^ N), we can still consider smooth data A to be 0(logN)-compressible and so achieve a 
0(B2)-time DFT algorithm. 

6 Conclusion 

Compressed Sensing (CS) methods provide algorithms for approximating the result of any large matrix multipUca- 
tion as long as it is known in advance that the result will be sparse/compressible. Hence, CS is potentially valuable 
for many numerical applications such as those involving multi-scale aspects [8, 18]. In this paper we used CS meth- 
ods to develop the first known deterministic sub-linear time sparse Fourier transform algorithm. In the process, we 
introduced a new deterministic Compressed Sensing algorithm along the lines of Cormode and Muthukrishnan (CM) 
[6, 7]. Our new deterministic CS algorithm improves on CM's algebraic compressibility results while simultaneously 
maintaining their results concerning exponential compressibility. 

Compressed Sensing is closely related to hashing methods, combinatorial group testing, and many other algorithmic 
problems [25, 13]. Thus, K-majority A:-strongly selective collections of sets and Algorithm 1 should help improve 
results concerning algebraically compressible (at each moment in time) stream hashing/heavy-hitter identification. 
Further development of these/other algorithmic applications is left as future work. It is also worthwhile to note that 
Monte Carlo Fourier results similar to those of [16] may be obtained by altering our measurement construction in 
Section 3. If we construct our Si collections by using only a small subset of randomly chosen qfs we will still locate 
all sufficiently energetic entries of A with high probability. The entries' coefficients can then be approximated by 
standard USFFT techniques [16, 10, 11, 22]. 

7 Acknowledgments 

We would like to thank Graham Cormode and S. Muthukrishnan for answering questions about their work. We 
would also like to thank Martin Strauss, Anna Gilbert, Joel Lepak, and Hualong Feng for helpful discussions, advice, 
and comments. 



References 



[1] C. Anderson and M. D. Dahleh. Rapid computation of the discrete Fourier transform. SI AM J. Sci. Comput., 
17:913-919, 1996. 

[2] L. I. Bluestein. A Linear Filtering Approach to the Computation of Discrete Fourier Transform. IEEE Transac- 
tions on Audio and Electwacoustics, 18:451^55, 1970. 

[3] J. P. Boyd. Chebyshev and Fourier Spectral Methods. Dover PubUcations, Inc., 2001. 

[4] E. Candes, J. Romberg, and T. Tao. Robust uncertainty principles: Exact signal reconstruction from highly 
incomplete frequency information. IEEE Trans. Inform. Theory, 52:489-509, 2006. 

[5] J. Cooley and J. Tukey. An algorithm for the machine calculation of complex Fourier series. Math. Comput, 

19:297-301, 1965. 

[6] G. Cormode and S. Muthukrishnan. Combinatorial Algorithms for Compressed Sensing. Technical Report 
DIM ACS TR 2005-40, 2005. 

[7] G. Cormode and S. Muthukrishnan. Combinatorial Algorithms for Compressed Sensing. Conference on Infor- 
mation Sciences and Systems, March 2006. 

[8] I. Daubechies, O. Runborg, and J. Zou. A sparse spectral method for homogenization multiscale problems. 
Multiscale Model. Sim., 2007. 

[9] R. A. DeVore. Deterministic constructions of compressed sensing matrices, http://www.ima.umn.edu/2006- 

2007/ND6.4-15.07/activities/DeVore-Ronald/Henrykfinal.pdf 2001. 

[10] A. Dutt and V. RokhUn. Fast Fourier transforms for nonequispaced data. SIAMJ.Sci. Comput., 14:1368-1383, 
1993. 

[1 1] J. A. Fessler and B. P. Sutton. Nonuniform Fast fourier transforms using min-max interpolation. IEEE Trans. 
Signal Proc, 51:560-574, 2003. 

[12] G. B. FoUand. Fourier Analysis and Its Applications. Brooks/Cole Publishing Company, 1992. 

[13] S. Ganguly and A. Majumder. CR-precis: A deterministic sunmiary structure for update data streams. ArXiv 
Computer Science e-prints, Sept. 2006. 

[14] C. F. Gerald and P. O. Wheatiey. Applied Numerical Analysis. Addison- Wesley Publishing Company, 1994. 

[15] A. Gilbert, S. Guha, P. Indyk, S. Muthukrishnan, and M. Strauss. Near-optimal sparse Fourier estimation via 
sampUng. ACM STOC, pages 152-161, 2002. 

[16] A. Gilbert, S. Muthukrishnan, and M. Strauss. Improved time bounds for near-optimal sparse Fourier represen- 
tations. SPIE, 2005. 

[17] P. Indyk. Explicit constructions of selectors and related combinatorial structures, with applications. In SODA '02: 
Proceedings of the thirteenth annual ACM-SIAM symposium on Discrete algorithms, pages 697-704, Philadel- 
phia, PA, USA, 2002. Society for Industrial and AppUed Mathematics. 

[18] M. A. Iwen. UnpubUshed Results, http://www-personal.umich.edu/markiwen/. 

[19] M. A. Iwen, A. C. Gilbert, and M. J. Strauss. Empirical evaluation of a sub-linear time sparse DFT algorithm. 
Submitted for Publication, 2007. 

[20] S. Kirolos, J. Laska, M. Wakin, M. Duarte, D. Baron, T. Ragheb, Y. Massoud, and R. Baraniuk. Analog-to- 
information conversion via random demodulation. Proc. IEEE Dallas Circuits and Systems Conference, 2006. 



[21] J. Laska, S. Kirolos, Y. Massoud, R. Baraniuk, A. Gilbert, M. Iwen, and M. Strauss. Random sampling for 
analog-to-information conversion of wideband signals. Proc. IEEE Dallas Circuits and Systems Conference, 
2006. 

[22] J.-Y. Lee and L. Greengard. The type 3 nonuniform FFT and its applications. J Comput. Phys., 206:1-5, 2005. 

[23] M. Lustig, D. Donoho, and J. Pauly. Sparse MRI: The application of compressed sensing for rapid MR imaging. 
Submitted for publication, 2007. 

[24] R. Maleh, A. C. Gilbert, and M. J. Strauss. Signal recovery from partial information via orthogonal matching 
pursuit. IEEE Int. Conf. on Image Processing, 2007. 

[25] S. Muthukrishnan. Data Streams: Algorithms and Applications. Foundations and Trends in Theoretical Com- 
puter Science, 1, 2005. 

[26] S. Muthukrishnan. Some Algorithmic Problems and Results in Compressed Sensing. Allerton Conference, 2006. 

[27] L. Rabiner, R. Schafer, and C. Rader. The Chirp z-Transform Algorithm. IEEE Transactions on Audio and 
Electroacoustics, AU-17(2):86-92, June 1969. 

[28] J. Tropp and A. Gilbert. Signal recovery from partial information via orthogonal matching pursuit. Submitted 
for Publication, 2005. 



A Proof of Theorem 2 



We begin by proving two lemmas. 



Lemma 3 IDENTIFICATION: Lines 7 through 19 of Algorithm 1 are guaranteed to recover all valid a)\, . . . ,0)^ (i.e., 
all CO with |A(a;)|2 > |A((yB)|2 - there may be> B such entries) more then ^ times. Hence, despite line 22, an entry 
for all such cob, 1 <b < B, will be added to in line 26. 

Proof: 

Because of the construction of Sq (i.e., proof of Lemma 1) we know that for each b e [1, B] there exist more then 
^ subsets S eSo such that S n {co},> \ V e [1, B']} = {coi,}. Choose any b e [1, B]. Denote the ^-primes that isolate M}, 
from all of a!\, cd^-i, coi,+i, (jg' by 

2K 

We next show that, for each /c' e we get < Xs a ,A > as one of the B' -H 1 largest magnitude 

o.ivh moa 

< XSo/j,, / A >-measurements identified in line 10. 
Choose any fc' e [1, K'\. We know that 

N-l 

I < ^ < |A(a;B)| - V2 2^ |A(a;i,OI < \H(^b)\ - Yu ^^'^^'^ 



< Xs J /A > 

o,;v«,j mod ,y 



We also know that the (B' -I- 1)^' largest measurement L^-magnitude must be < | . Hence, we are guaranteed to execute 
Unes 12-15 with an vq. = coi, mod qj,^ . 
Choose any / e [1, m] and set 

Q = ^cob' I b' e [B',N), coi,, = cob mod qj^, cob- ^ cob mod 



Using Lemma 2 we can see that line 13 inspects all the necessary residues of coi, mod piqj^ . To see that tmin will be 
chosen correctly we note first that 



< Xs 



mod( 



,A>- <xs 



,A > 



Furthermore, setting ro,. = co\, mod cjj^, and 

Q' = \oji,' I h' e [B',N), oj;,' = oji, mod (jy^,, wi,' ^ {ra,. + tqj^,) mod for some f with {rQ^. + tqj^,) ^ a;;, mod 
we have 

N-l 

Finally we can see that 



|A(a;i,)| 



mod 



lit' 



,A>-<xs 



,A> 



Hence, lines 13 and 14 will indeed select the correct residue for cob modulo pi. Therefore, line 16 will correctly recon- 



struct CO}, at least K' > ^ times. □ 



Lemma 4 ESTIMATION: Every {co, Aa) stored in R in line 27 is such that \A{co) - A.(o\2 < £■ 
Proof: 

Suppose that {co, Aco) is stored in R^. This only happens if A(ai) has been estimated by 

<^\.mod,/^> = Z 

6j=a> mod qj 

for more then ^ (jy-primes. The only way that any such estimate can have |A(cl')- < Xs / ^ > li ^ 

is if 0) collides with one of a;i, . . . , cob' modulo qj (this is due to the definition of B' in Equation 7). By the proof of 

Lenmia 1 we know this can happen at most B'Llogg, NJ < | times. Hence, more then half of the ^ estimates, A^, 

X f 

must be such that |A(cl)) - A^Ji < It follows that taking medians as per lines 24 and 25 will result in the desired 
e-accurate estimate for A(a;). □ 

We are now ready to prove Theorem 2. 

Theorem 2 LetRopt be a B-optimal Fourier representation for our input signal A. Then, the B term representation R^ 

returned from Algorithm 1 is such that \\A—R\\^ < \\A- /?optll2 + ^^'^^""^^ . Furthermore, Algorithm 1 's Identification 
and Estimation (lines 7 - 30) run time is 0(B'^ log^ N). The number of measurements used is 0(B'^ log^ N). 

Proof- 
Choose any b e (0, B]. Using Lemmas 3 and 4 we can see that only way some coi, ^ Rg is if there exists some 
associated b' e (B, N) so that cvt' e R' and 



|A(a;B)| + e > |A(a;i,OI +e> \A^,, I > |A<„J > |A(a;i,)| - e > \A{cob)\ - e. 



In this case we'll have 2e > |A(a;i,)| — |A(cL)i,')| > so that 

iMcoh'f + 4e (e + |A(a;B)|) > \A{(Ob'f + 4e (e + \Mcoh')\) > \M(Obf- (8) 
Now using Lemma 4 we can see that 

||A-R|p= E |AM-^-l'< E \Mcof + B-e\ 

Furthermore, we have 

(<i),-)?ft' be(p,B], (Ot^tC b'€(B,N), a)i,itC 

Using observation 8 above we can see that this last expression is bounded above by 

B ■ {5e^ + Ae\A{a)B)\) + ^ \M(^b'f+ X |A(a;hOl' < l|A - Roptll2 + B • (Se^ + 4e|A(a)B)l). 

b'€[B,N), cot, etc b'€(B,N), cot, itC 

Substituting for e (see Equation 6) gives us our result. Mainly, 

B . (5^ . 4.|A(..,|, = ^ ( A . 2 VJ) < 

We next focus on run time. Algorithm I's Identification (i.e., lines 7 through 19) run time is dominated by the 
0{KB'm) executions of line 13. And, each execution of line 13 takes time 0(p„,). Hence, given that m - O(logN), 
Pm — 0(logJV • log log N), and K = 0{B' logg, N), we can see that Identification requires 0(B'2-logB,N-log^N- 
loglogN)-time. 

Continuing, Algorithm I's Estimation (i.e., lines 20 through 30) run time is ultimately determined by line 22's 
iF-statement. Although line 22 is executed 0{KB') = 0{B'^ logg, N) times, it can only evaluate to true O(B') times. 
Hence, each line 24/25 0{B' logg, N log B')-time median operation will be evaluated at most O(B') times. The result- 
ing Estimation runtime is therefore 0(B'^ logs' N^logB'). 

To bound the number of measurements we recall that: (/) the number of measurements is < m ■ K ■ pmi]K, (iO 
m = O(logN), (iii) p^ = 0{logN ■ log log N), (iv) K = 0(B' logN), and {v) = O(KlogK). Hence, the number 
of measurements is O (ji? log K log^ N log log n). Substituting for K gives us the desired bound. □ 



B Proof of Theorem 3 

Theorem 3 Let A be p-compressible. Then, Algorithm 1 can return a B term sparse representation with \\A—R\\^ < 
\\A — /?optll2 + SllCg^'llj using 0^~6~ log^wj total identification/estimation time and O^B~6~ log'' mea- 
surements. If A decays exponentially. Algorithm 1 can return a B term sparse representation, R' , with \\A — R\\^ < 
l|A - /?optll2 + ^llCg'' II2 using both {b^ -t log^ 6^) ■ polylog(N) measurements and identification/estimation time. 

Proof: 

We first deal with the algebraically compressible case. We have to determine our Algorithm 1 's B' and Theorem 2's 
C variables. Moving toward that goal we note that 

^^l^ = lo(B-^-) = o(l)|lCrili 



After looking at Theorem 2 we can see that we must use C = O and a B' (see Equations 6 and 7) so that 

N-l 

\A{coi,)\ = 0(6 • |A(a;B)|) = 0(6 • B-P). 

b=B' 

Continuing, 

N— 1 y ^co \ 

g |A(a;i,)|2 = O b-Mbj = O(B'i-P). 

(1 JL\ 
6'^Bp M. Applying Theorem 2 gives us Algorithm I's runtime and number of required 

measurements. 

We next deal with the exponentially compressible case. Now, as before, we have to determine our Algorithm 1 's B' 
and Theorem 2's C variables. To do so we note that 

5^J^=«0(4-) = 0(§)||C"/'|g. 
After looking at Theorem 2 we can see that we must use C = O (|) and a B' so that 

Z = O (^-^) = O (6 • 2--'°^^) . 

fi=R' V ' 



N-l 



b=B' 

Continuing, 

2^ |A(a;i,)|2 = O Y,^-"' =0(2-««'). 

b=B' \b=B' ) 

Hence, we must use B' = O^B + log(|)° j. Applying Theorem 2 gives us Algorithm I's runtime and number of 
required measurements. □ 

C Proof of Theorem 4 

Theorem A Let f : [0, In] C have {i) an integrable derivative, and (ii) /(O) = f{2n), f'{0) = f'{2n), . . ., fP'^^ (0) 
f^~^\2n) for some p > 1. Furthermore, assume that f's B' = O^B~e~^ largest magnitude frequencies all belong 
to (-[y j/ [tJ]- '^^y Algorithm 1 to return a B term sparse Fourier representation, k\ for f such that 

\\f-R\\l < ll/--Roptll2 + i5||C°P'||2 using o{B7^^d^p log^ N^-time and O (b^ d'^p \og^ N^-measurements from f . 

Proof: 

We note that Theorems 2 and 3 stUl hold for p-compressible infinite signals/vectors A (i.e., signals with oo-length). 
For the purposes of proof we may consider A to be formed by any bijective mapping g : N — > Z so that both 

and 

g{n) = n mod N, for all n e N, 

are true. We then set 

A(n) = f{g{n)) , for all n e N. 





-N- 




N 




(- 


"2 




.2. 





In this case we note that f's B' largest magnitude frequencies belonging to [yj] implies our that iC-majority 

B'-strongly separating collection of (infinite) subsets, S, wiU still correctly isolate all the B' most energetic frequency 
positions in A. 

Continuing, we may still consider our infinite length p-compressible, with p > 1, signal A (i.e., the Fourier co- 
efficient series /) to be sorted by magnitude for the purpose of identifying valid coi,Cl>2, etc.. Furthermore, we may 
bound the (now infinite) sums of A's entries' magnitudes by the same integrals as above. The proofs of the the- 
orems/supporting lemmas will go through exactly as before if we consider all the finite sums in their proofs to be 
absolutely convergent infinite sums. The only real difference from our work in Section 4 is that we are computing our 
< XSi A >-measurements differently. 

I 2L 6 \ 

Similar to Theorem 2, the required number of measurements from / will be OyBf-^e^ i' log N\. This is exactly 
because for each {qj, p/)-pair we compute all the measurements 

[<Xs,j^,f> \ he[0,qjpi)] 

via one FFT requiring qjpi samples. Hence, the number of samples from / we use is once again bounded above by 
m- K- pm^jK- Furthermore, each of the mK FFT's will take 0{pmqK logPm'?K)-tiine (despite the signal lengths' factor- 
izations [2, 27]). The result follows. □ 



